ED 166 257                                        TM 008 333

AUTHOR          Rudner, Lawrence M.
TITLE           A Short and Simple Introduction to Tailored Testing.
PUB DATE        Mar 78
NOTE            15p.; Paper presented at the Annual Meeting of the
                Eastern Educational Research Association
                (Williamsburg, Virginia, March, 1978)

EDRS PRICE      MF-$0.83 HC-$1.67 Plus Postage.
DESCRIPTORS     Ability; Bayesian Statistics; *Individual Tests;
                *Item Analysis; *Item Banks; *Mathematical Models;
                Test Construction; *Testing; Testing Problems; Test
                Items
IDENTIFIERS     Adaptive Testing; *Computer Assisted Testing; Fixed
                Steps Procedure; Flexilevel Tests; Robbins Monro
                Process; *Tailored Testing; Test Length

ABSTRACT



Tailored testing provides the same information as
group-administered standardized tests, but can do so using fewer
items because the items administered are selected for the ability of
the individual student. Thus, tailored testing offers several
advantages over traditional methods. Because individual tailored
tests are not timed, anxiety is reduced and examinee motivation is
improved. Economic advantages involve reduced test time, immediate
availability of results, and reduced personnel requirements.
Effective tailoring occurs at the item level, involving two steps:
estimation of the examinee's ability from his or her previous
responses, and selection from an item bank of the item likely to
measure most effectively. Five methods of estimating item difficulty
or appropriateness are: (1) Robbins Monro procedure; (2) fixed step
size; (3) flexilevel, which requires a smaller item pool; (4)
Bayesian procedures; and (5) stratified-adaptive or stradaptive
procedures. (Author/GDC)




A Short and Simple Introduction to

Tailored Testing


Lawrence M. Rudner

Gallaudet College
The Model Secondary School for the Deaf
Kendall Green
Washington, D.C. 20002



Paper presented at the Eastern Educational Research Association
Annual Meeting, Williamsburg, VA, March, 1978.






This paper was sponsored in part with funds from P.L. 89-694.
However, the opinions and views expressed herein are those of
the author and do not necessarily reflect those of the Model
Secondary School for the Deaf or the U.S. Department of Health,
Education and Welfare.







A Short and Simple Introduction to

Tailored Testing

Lawrence M. Rudner

As probabilistic models with sample-independent item
descriptions, latent trait theories have several appealing
features: (1) the performance of an examinee of known ability
on a given calibrated item can be predicted; (2) items which
were calibrated on different populations can readily be combined
to form an item pool of predictable characteristics; and (3)
item descriptions are independent of each other. These and other
features, along with the advent of readily accessible high speed
computers, have spawned a re-examination of test development and
test usage procedures. In this symposium, Drs. Robertson, Rentz
and Durovic have shown how a latent trait model can
be applied to the practical issues of test development, equating,
and item bias.

Prior to the popularization of group testing procedures, most
tests were individually administered and tailored to examinees. At
the present time some of the largest developers and users of
educational tests, i.e., the United States Department of Defense
and the United States Civil Service Commission, are re-examining
this idea in light of the theoretical and practical benefits of
latent trait models.

The term "tailored testing" serves as a generic for any
procedure by which particular items or groups of items are selected
and administered to any individual examinee based on an estimate
of his or her ability. For example, an examinee's grade
placement can serve as an initial estimate of the examinee's ability
and be used to route the examinee to a set of items, e.g., a
particular level of an achievement test battery. Such a procedure fits
the definition in the broadest sense of the word. However, the
procedure will misroute large numbers of both very able and less
able students, and is not considered in most discussions of
tailored testing.

Effective tailoring occurs at the item level; effective in
the sense that examinees across a wide continuum of ability are
administered items appropriate to their competence and are
therefore not misrouted. The task involves two basic iterated steps:

1. Estimation of the examinee's ability from his or her
   previous responses.

2. Selection of the item likely to measure most effectively
   at the presently estimated ability level (Lord, 1977).

Prior to tailoring, an item pool must be developed and item
characteristics computed. Since the best items will be those
whose difficulties most closely match the examinee's ability, all
item-level tailoring schemes use an index of item difficulty
(either the proportion of examinees responding correctly, the
Rasch model easiness parameter, or the 2- and 3-parameter model
location or difficulty parameter). Some schemes will also
incorporate item discrimination indices and/or guessing indices.
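
As a rough illustration of how these indices enter a response model,
the sketch below computes the probability of a correct response from
a difficulty (location), a discrimination, and a guessing parameter,
alongside the classical proportion-correct index. The logistic form
and the names p_correct, proportion_correct, theta, a, b, and c are
assumptions chosen for the example; the paper itself gives no formula
for these models.

```python
import math


def p_correct(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under a logistic item model.

    theta: examinee ability; b: item difficulty (location);
    a: item discrimination; c: guessing (lower asymptote).
    All names and the logistic form are assumptions for illustration.
    With a = 1 and c = 0 this reduces to a one-parameter form.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def proportion_correct(responses):
    """Classical difficulty index: share of examinees answering correctly."""
    return sum(responses) / len(responses)
```

Under these assumptions an item whose difficulty equals the
examinee's ability is answered correctly half the time when guessing
is absent, and somewhat more often when c is above zero.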

The interested reader is referred to Angoff and Huddleston
(1958), Ferguson (1969), Krathwohl and Huyser (1956), Linn, Rock
and Cleary (1968, 1972), Lord (1971a, 1971b), Olivier (1973),
Owen (1969), Vale (1975) and Weiss (1973, 1974) for further
descriptions of the presented approaches and others and to Cleary,
Linn and Rock (1968), Linn, Rock and Cleary (1968, 1972), Lord
(1971a), Urry (1977), Vale (1975), and Vale and Weiss (1975) for
some evaluations.

Robbins Monro Procedures

In Robbins Monro procedures, the difficulty of the (i + 1)st
item to be administered is determined by the rule

    $$ b_{i+1} = d_i (M_i - g) + b_i $$

where $b_i$ is the difficulty of the ith administered item,
      $d_i$ is a descending sequence of positive numbers,
      $M_i$ is the response to item i ($M_i = 1$ when correct,
      $M_i = 0$ when incorrect), and
      $g$ is an offset parameter.

This procedure is illustrated in Figure 1. The examinee
took eight items. Harder items are administered after each
correct response, easier after each incorrect response. The difference
in item difficulty between consecutively administered items decreases
proportionately since $d_i$ is a descending sequence. The process
continues converging on the point at which the item difficulty is
equal to the examinee's ability and is terminated when a
satisfactory estimate is achieved. After n items are administered, the
difficulty of the (n + 1)st item can then be used as the estimate
of the examinee's ability.
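
The rule above can be traced numerically. The following is a minimal
sketch, not the procedure as implemented in any of the cited studies:
it assumes item difficulties can take any value (an unlimited pool),
takes d_i = d0 / i as one possible descending sequence, and uses
g = 0.5. The function name and the default values are hypothetical.

```python
def robbins_monro(responses, b_start=0.0, g=0.5, d0=2.0):
    """Return the difficulties b_1 .. b_{n+1} implied by the update rule.

    responses: list of 0/1 scores M_i in order of administration.
    d_i = d0 / i is an assumed descending sequence of positive numbers.
    """
    difficulties = [b_start]
    for i, m in enumerate(responses, start=1):
        d_i = d0 / i
        # b_{i+1} = d_i * (M_i - g) + b_i
        difficulties.append(d_i * (m - g) + difficulties[-1])
    return difficulties


# Eight hypothetical responses; the last entry is the difficulty of the
# ninth item, which serves as the ability estimate.
path = robbins_monro([1, 1, 0, 1, 0, 0, 1, 0])
estimate = path[-1]
```

With these assumed settings the change in difficulty shrinks at every
step, which is the behaviour the descending sequence is meant to
produce.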



Fixed Step Size 

Rather than using the decreasing step size governed by $d_i$
in the Robbins Monro procedures, $d_i$ can be held constant and the
(i + 1)st item to be administered can be selected by

    $$ b_{i+1} = b_i + d (M_i - g) $$

Figure 2 illustrates this procedure. This procedure can never
truly converge on the point where the item difficulty equals the
examinee's ability. The difficulty of the administered items
will vacillate between being just above the examinee's ability
and just below it. The average difficulty of the administered
items can be used as an estimate of the examinee's ability.
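
A short numerical sketch of the fixed step rule follows. It is an
illustration under stated assumptions only: difficulties are treated
as continuous values, d = 0.5 and g = 0.5 are arbitrary choices, and
the function names are hypothetical.

```python
def fixed_step(responses, b_start=0.0, d=0.5, g=0.5):
    """Return the difficulties b_1 .. b_{n+1} for a constant step size d."""
    difficulties = [b_start]
    for m in responses:
        # b_{i+1} = b_i + d * (M_i - g)
        difficulties.append(difficulties[-1] + d * (m - g))
    return difficulties


def ability_estimate(difficulties):
    """Average difficulty of the administered items.

    The final entry is the next, not yet administered, item and is
    therefore left out of the average.
    """
    administered = difficulties[:-1]
    return sum(administered) / len(administered)


path = fixed_step([1, 1, 0, 1, 0, 1, 0, 0])
estimate = ability_estimate(path)
```

In this example the difficulties move up and down in steps of 0.25
without settling, and the estimate is the mean of the eight
administered difficulties.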
Flexilevel

One practical limitation with the Robbins Monro and Fixed
Step Size procedures is the need for extremely large item pools.
In theory, the latter procedure will require n(n + 1)/2 items, the
former substantially more.

The flexilevel procedure routes the examinee to the next
less difficult unadministered item following an incorrect response.
Following a correct response, the examinee is routed to the next
more difficult unadministered item. Thus, the difficulty of the
(i + 1)th item is based on the available item pool. The procedure
is illustrated in Figure 3. After the item whose difficulty most
closely matches the examinee's ability is administered, the
selected item oscillates between being substantially too easy and
substantially too difficult for the examinee.
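
The routing rule just described can be sketched as follows. This is
an illustration under assumptions that go beyond the text: testing
starts at the middle item of the sorted pool, the examinee is
simulated by a caller-supplied function, and testing stops when no
unadministered item remains in the required direction. The name
flexilevel and its parameters are hypothetical.

```python
def flexilevel(difficulties, answer, n_items):
    """Administer up to n_items from a pool sorted by difficulty.

    answer: function taking an item difficulty and returning True
    (correct) or False (incorrect); it stands in for the examinee.
    Returns a list of (difficulty, correct) pairs.
    """
    pool = sorted(difficulties)
    current = len(pool) // 2             # assumed starting point: middle item
    lo, hi = current - 1, current + 1    # next easier / next harder unused item
    administered = []
    for _ in range(n_items):
        correct = answer(pool[current])
        administered.append((pool[current], correct))
        if correct and hi < len(pool):
            current, hi = hi, hi + 1     # next more difficult unadministered item
        elif not correct and lo >= 0:
            current, lo = lo, lo - 1     # next less difficult unadministered item
        else:
            break                        # pool exhausted in that direction
    return administered


# Hypothetical examinee who answers correctly whenever difficulty <= 0.5.
record = flexilevel(range(-3, 4), lambda d: d <= 0.5, 7)
```

For this simulated examinee the administered difficulties move
steadily further from the ability level on alternate sides, which
matches the oscillation noted above.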



[FIGURE 2. A Hypothetical Example of the Fixed Steps Tailored
Testing Procedure. Axis: item difficulty.]




Bayesian Procedures

Bayes theorem can be written as

    $$ P(A \mid B) = K \cdot P(B \mid A) \cdot P(A) $$

where K is a constant and P(A|B) denotes the probability of A
given B. Substituting estimated ability for A and item response
for B, the theorem is well suited to a measurement model which
specifies the probability of a correct response given an
examinee's ability, P(B|A). Assuming a normal prior (that is, P(A)
is normal) and using a latent trait item response model, an
estimate of examinee ability can be inferred from each item
response. After each item is administered, the obtained posterior
estimate of ability serves as the prior estimate for the next
item. Items are selected so as to minimize a loss function. When
guessing is assumed to be void and item discriminations
approximately equal (as with the Rasch model), items are selected
such that the difference between item difficulty and the estimated
ability is the minimum possible within the item pool restraints.
When guessing is a factor, the optimal difficulty is a bit less
than the examinee's estimated ability. Testing is terminated when
the standard error of estimation of ability is sufficiently small
or when a maximum number of items have been administered.
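
The updating cycle described in this section can be sketched with a
discrete grid over ability. This is an illustration only: the grid
approximation is substituted here for whatever updating formulas the
cited Bayesian work employs, the one-parameter logistic response
function is an assumed stand-in for the Rasch case, and the function
names, grid limits, and stopping values are hypothetical.

```python
import math


def rasch_p(theta, b):
    """Assumed one-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def normal_prior_grid(lo=-4.0, step=0.05, n=161):
    """Ability grid with standard normal prior weights summing to 1."""
    grid = [lo + step * k for k in range(n)]
    raw = [math.exp(-0.5 * t * t) for t in grid]
    total = sum(raw)
    return grid, [w / total for w in raw]


def bayes_update(grid, prior, b, correct):
    """Posterior weights: P(A|B) = K * P(B|A) * P(A) at each grid point."""
    like = [rasch_p(t, b) if correct else 1.0 - rasch_p(t, b) for t in grid]
    unnorm = [l * p for l, p in zip(like, prior)]
    k = 1.0 / sum(unnorm)                # the constant K
    return [k * w for w in unnorm]


def mean_sd(grid, weights):
    """Posterior mean and standard deviation of ability."""
    mean = sum(t * w for t, w in zip(grid, weights))
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights))
    return mean, math.sqrt(var)


def bayesian_test(pool, answer, max_items=20, sd_stop=0.3):
    """Pick the unused item closest in difficulty to the current estimate."""
    grid, weights = normal_prior_grid()
    remaining = list(pool)
    mean, sd = mean_sd(grid, weights)
    n = 0
    while remaining and n < max_items and sd > sd_stop:
        b = min(remaining, key=lambda x: abs(x - mean))
        remaining.remove(b)
        # The posterior after this item becomes the prior for the next.
        weights = bayes_update(grid, weights, b, answer(b))
        mean, sd = mean_sd(grid, weights)
        n += 1
    return mean, sd, n
```

Selecting the item nearest the current estimate reflects the equal
discrimination, no guessing case described above; a pool with
guessing would call for a somewhat easier choice.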
Stratified-adaptive (stradaptive) Procedures

Two of the main advantages of the Bayesian procedures are
that prior, non-test estimates of ability can be used to allow
[...] that item information in addition
to item difficulty can be used. The main limitation is that
items must be evaluated for their effectiveness. The
stratified-adaptive procedures seek to circumvent this limitation
while retaining the benefits.

[FIGURE 4. A Hypothetical Example of Stratified-Adaptive Tailored
Testing. Axis: item difficulty, from easier to harder.]

Figure 4 illustrates stratified-adaptive tailored testing.
Items are arranged into strata of increasing difficulty. In
this example, seven strata are defined. Within each stratum,
the items are arranged in order of their discriminability.
Testing begins by administering the best discriminating item
in the stratum whose difficulty level most closely matches the
prior estimate of the examinee's ability, or in the median stratum
when no prior is available. After a correct response, the
procedure routes the examinee to the best discriminating
unadministered item in the next more difficult stratum. Following an
incorrect response, the examinee is administered the best
discriminating unadministered item in the next less difficult
stratum.
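
The routing between strata can be sketched as follows. This is an
illustration under assumptions not stated in the text: each item is a
(difficulty, discrimination) pair, the examinee stays in the extreme
stratum when no easier or harder stratum exists, and testing stops if
the current stratum runs out of items. The name stradaptive and its
parameters are hypothetical.

```python
def stradaptive(strata, answer, n_items, start=None):
    """Route an examinee through strata ordered easiest to hardest.

    strata: list of strata; each stratum is a list of
            (difficulty, discrimination) pairs.
    answer: function taking an item and returning True or False.
    start:  index of the entry stratum; the median stratum if None.
    Returns a list of (stratum index, item, correct) records.
    """
    # Within each stratum, the best discriminating items come first.
    queues = [sorted(s, key=lambda item: -item[1]) for s in strata]
    level = len(queues) // 2 if start is None else start
    administered = []
    for _ in range(n_items):
        if not queues[level]:
            break                        # stratum exhausted (assumed stop rule)
        item = queues[level].pop(0)
        correct = answer(item)
        administered.append((level, item, correct))
        if correct:
            level = min(level + 1, len(queues) - 1)   # next harder stratum
        else:
            level = max(level - 1, 0)                 # next easier stratum
    return administered


# Seven hypothetical strata with three items each.
strata = [[(b, a) for a in (0.8, 1.2, 1.0)] for b in range(-3, 4)]
record = stradaptive(strata, lambda item: item[0] <= 0.5, 5)
```

For this simulated examinee the procedure alternates between the two
strata that bracket the ability level, using the most discriminating
unused item each time.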
Conclusions 

One can safely say that the majority of standardized tests
are administered to groups of students rather than individually
administered, use a paper and pencil format with separate
question and response sheets, present all items to all examinees
in a fixed order, and have set time limits. Tailored testing
seeks to provide the same information as such group tests but
by presenting fewer items. Since the items are tailored to the
student's ability and since fewer items are administered, tailored
tests can offer several advantages over traditional assessment
instruments.



In a comprehensive 1973 review of the advantages and
limitations of tailored testing, Weiss and Betz point out that the high
degree of standardization of group tests introduces problems of
time limits, answer sheets, test compromise, administrator
variables, and item arrangements as they affect whole groups
and as they affect certain subgroups of individuals. Individual
tests, on the other hand, are untimed, thus minimizing anxiety
and increasing accuracy, and better maintain motivation since
guessing and fatigue are reduced.

An additional factor which is not often discussed is the
manageability of tailored testing. Test results are immediately
available, testing on demand is possible, trained test
administrators are not required, and examinee time is reduced. As a
consequence, tailored tests can offer economic as well as
psychometric advantages over conventional tests.







REFERENCES 



Angoff, W.H. and Huddleston, E.M. The multi-level experiment:
    A study of a two-level test system for the College Board
    Scholastic Aptitude Test. Princeton, N.J.: Educational Testing
    Service, 1958.

Cleary, T.A., Linn, R.L. and Rock, D.A. An exploratory study
    of programmed tests. Educational and Psychological
    Measurement, 1968, 28, 345-360.

Ferguson, R.L. The development, implementation and evaluation of
    a computer assisted branched test for a program of
    individually prescribed instruction. Unpublished doctoral
    dissertation, University of Pittsburgh, 1969.

Krathwohl, D.R. and Huyser, R.J. The sequential item test.
    American Psychologist, 1956, 11, 419.

Linn, R.L., Rock, D.A. and Cleary, T.A. The development and
    evaluation of several testing methods. Educational and
    Psychological Measurement, 1968, 29, 125-146.

Linn, R.L., Rock, D.A. and Cleary, T.A. Sequential testing for
    dichotomous decisions. Educational and Psychological
    Measurement, 1972, 32, 85-95.

Lord, F.M. Robbins-Monro procedures for tailored testing.
    Educational and Psychological Measurement, 1971a, 31, 3-31.

Lord, F.M. The self-scoring flexilevel test. Journal of
    Educational Measurement, 1971b, 8, 147-157.

Lord, F.M. Practical applications of item characteristic curve
    theory. Journal of Educational Measurement, 1977, 14(2),
    117-138.

Olivier, P. An overview of tailored testing. Unpublished
    doctoral preliminary examination paper, Florida State
    University, 1973.

Owen, R.J. A Bayesian approach to tailored testing. Princeton,
    N.J.: Educational Testing Service, 1969.

Urry, V.W. Tailored testing: A successful application of latent
    trait theory. Journal of Educational Measurement, 1977, 14(2),
    181-196.

Vale, C.D. Strategies of branching through an item pool. In
    D.J. Weiss (Ed.), Computerized adaptive trait measurement:
    Problems and prospects. Minneapolis: University of Minnesota,
    Psychometric Methods Program, 1975.






Vale, C.D. and Weiss, D.J. A simulation study of stradaptive
    ability testing. Minneapolis: University of Minnesota,
    Psychometric Methods Program, 1975.

Weiss, D.J. The stratified adaptive computerized ability test.
    Minneapolis, MN: University of Minnesota, Psychometric
    Methods Program, 1973.

Weiss, D.J. Strategies of adaptive ability measurement.
    Minneapolis: University of Minnesota, Psychometric Methods
    Program, 1974.

Weiss, D.J. and Betz, N.E. Ability measurement: Conventional or
    adaptive? Minneapolis, MN: University of Minnesota,
    Psychometric Methods Program, 1973.



